Univariate - MA Data Analysis

Univariate

Open days

##   obs_days open_days closed_days
## 1      169       161           8
## # A tibble: 2 × 3
##   is_closed     n  prop
##   <lgl>     <int> <dbl>
## 1 FALSE       161  95.3
## 2 TRUE          8   4.7
is_closed N Percent
FALSE 161 95.27
TRUE 8 4.73
All 169 100.00
## 
## Skewness |    SE
## ----------------
##    0.726 | 0.185
## Skewness |    SE
## ----------------
##    1.327 | 0.185
## Skewness |    SE
## ----------------
##    0.669 | 0.185

Basic Summary of Dependent Variables

## # A tibble: 4 × 13
##   variable        n   min   max median    q1    q3   iqr   mad  mean    sd    se
##   <fct>       <dbl> <dbl> <dbl>  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 food_loss_…   161     0 13.8    7.35  6.7   8.4   1.7  1.11  7.83  2.17  0.171
## 2 food_waste…   161     0  6.55   2.1   1.1   2.95  1.85 1.33  2.19  1.40  0.111
## 3 liquid_was…   161     0  4.5    1.5   0.65  2.05  1.4  1.04  1.48  0.995 0.078
## 4 solid_wast…   161     0  2.95   0.65  0.35  0.95  0.6  0.445 0.708 0.499 0.039
## # ℹ 1 more variable: ci <dbl>
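As a quick consistency check on the summary table, the `iqr` and `se` columns can be recomputed from the other columns, using iqr = q3 - q1 and se = sd / sqrt(n). The sketch below is written in Python rather than the R code that produced this report. It uses the rounded values printed above, so differences in the last digit are expected. Row labels are kept truncated exactly as printed.

```python
import math

# Values copied from the rounded summary table above: (n, q1, q3, sd).
# Row labels are kept truncated exactly as they appear in the output.
rows = {
    "food_loss_…": (161, 6.7, 8.4, 2.17),
    "food_waste…": (161, 1.1, 2.95, 1.40),
    "liquid_was…": (161, 0.65, 2.05, 0.995),
    "solid_wast…": (161, 0.35, 0.95, 0.499),
}


def iqr_and_se(n, q1, q3, sd):
    """Return (interquartile range, standard error of the mean)."""
    return q3 - q1, sd / math.sqrt(n)


results = {name: iqr_and_se(*vals) for name, vals in rows.items()}

for name, (iqr, se) in results.items():
    print(f"{name:12s} iqr={iqr:.2f} se={se:.3f}")
```

The recomputed values agree with the printed `iqr` and `se` columns up to rounding of the inputs.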

Histograms

Histogram with density

Q-Q plot

Shapiro-Wilk test

## # A tibble: 3 × 3
##   variable        statistic             p
##   <chr>               <dbl>         <dbl>
## 1 food_waste_kg       0.952 0.0000260    
## 2 liquid_waste_kg     0.951 0.0000192    
## 3 solid_waste_kg      0.903 0.00000000783

From the output, all three p-values are far below 0.05, which implies that the distribution of each variable differs significantly from a normal distribution. In other words, we cannot assume normality for the raw daily totals.

Histogram Food Waste per customer

Q-Q plot Food Waste per customer

Shapiro-Wilk test per customer

## # A tibble: 3 × 3
##   variable          statistic        p
##   <chr>                 <dbl>    <dbl>
## 1 food_waste_p_kg       0.987 1.38e- 1
## 2 liquid_waste_p_kg     0.984 6.10e- 2
## 3 solid_waste_p_kg      0.863 6.24e-11

From the output, the p-value for solid food waste per customer is far below the significance level of 0.05, but the p-values for the other two variables are not (0.138 and 0.061). This implies that the distribution of solid food waste per customer differs significantly from a normal distribution. In other words, normality is not rejected for total food waste and liquid food waste per customer, so we can assume normality for those two variables but not for solid food waste.

Histogram logged Food Waste

Q-Q plot logged Food Waste

Shapiro-Wilk test for logged food waste

## # A tibble: 3 × 3
##   variable            statistic       p
##   <chr>                   <dbl>   <dbl>
## 1 log_food_waste_kg       0.979 0.0153 
## 2 log_liquid_waste_kg     0.972 0.00208
## 3 log_solid_waste_kg      0.979 0.0166
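The decision rule used in the interpretations above (reject normality when p < 0.05) can be applied uniformly to all three Shapiro-Wilk tables. The sketch below is in Python, not the report's own R code. It only restates the rule on the p-values printed above and does not recompute the tests.

```python
ALPHA = 0.05  # significance level used in the text above

# Shapiro-Wilk p-values copied from the three tables above.
p_values = {
    "food_waste_kg": 0.0000260,
    "liquid_waste_kg": 0.0000192,
    "solid_waste_kg": 0.00000000783,
    "food_waste_p_kg": 1.38e-1,
    "liquid_waste_p_kg": 6.10e-2,
    "solid_waste_p_kg": 6.24e-11,
    "log_food_waste_kg": 0.0153,
    "log_liquid_waste_kg": 0.00208,
    "log_solid_waste_kg": 0.0166,
}


def rejects_normality(p, alpha=ALPHA):
    """True when the test rejects the null hypothesis of normality."""
    return p < alpha


# Variables for which normality is NOT rejected at the chosen level.
not_rejected = [name for name, p in p_values.items() if not rejects_normality(p)]
print(not_rejected)
```

With these values, only total and liquid food waste per customer are not rejected. The logged variables are still rejected at the 0.05 level, although their test statistics (0.972 to 0.979) are closer to 1 than those of the raw variables (0.903 to 0.952).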

Time Series Plots

Daily Time Series

Daily plot per customer

Decomposition

## 
##  Fitting models using approximations to speed things up...
## 
##  ARIMA(2,0,2) with non-zero mean : 595.2761
##  ARIMA(0,0,0) with non-zero mean : 607.2775
##  ARIMA(1,0,0) with non-zero mean : 598.3493
##  ARIMA(0,0,1) with non-zero mean : 606.2906
##  ARIMA(0,0,0) with zero mean     : 795.7987
##  ARIMA(1,0,2) with non-zero mean : 593.7226
##  ARIMA(0,0,2) with non-zero mean : 603.5818
##  ARIMA(1,0,1) with non-zero mean : 598.3892
##  ARIMA(1,0,3) with non-zero mean : 594.7845
##  ARIMA(0,0,3) with non-zero mean : 602.7266
##  ARIMA(2,0,1) with non-zero mean : 593.1346
##  ARIMA(2,0,0) with non-zero mean : 593.03
##  ARIMA(3,0,0) with non-zero mean : 591.0829
##  ARIMA(4,0,0) with non-zero mean : 593.9004
##  ARIMA(3,0,1) with non-zero mean : 593.1032
##  ARIMA(4,0,1) with non-zero mean : 594.6705
##  ARIMA(3,0,0) with zero mean     : 655.5828
## 
##  Now re-fitting the best model(s) without approximations...
## 
##  ARIMA(3,0,0) with non-zero mean : 600.6932
## 
##  Best model: ARIMA(3,0,0) with non-zero mean
## Series: df$food_waste_kg 
## ARIMA(3,0,0) with non-zero mean 
## 
## Coefficients:
##          ar1      ar2      ar3    mean
##       0.1053  -0.2083  -0.1262  2.0746
## s.e.  0.0788   0.0769   0.0786  0.0871
## 
## sigma^2 = 1.97:  log likelihood = -295.16
## AIC=600.33   AICc=600.69   BIC=615.97
## 
##  Fitting models using approximations to speed things up...
## 
##  ARIMA(2,0,2) with non-zero mean : 242.2204
##  ARIMA(0,0,0) with non-zero mean : 254.9591
##  ARIMA(1,0,0) with non-zero mean : 242.9804
##  ARIMA(0,0,1) with non-zero mean : 254.9337
##  ARIMA(0,0,0) with zero mean     : 424.4576
##  ARIMA(1,0,2) with non-zero mean : 240.5345
##  ARIMA(0,0,2) with non-zero mean : 253.0456
##  ARIMA(1,0,1) with non-zero mean : 242.4608
##  ARIMA(1,0,3) with non-zero mean : 241.1252
##  ARIMA(0,0,3) with non-zero mean : 252.9766
##  ARIMA(2,0,1) with non-zero mean : 240.7382
##  ARIMA(2,0,3) with non-zero mean : 243.1306
##  ARIMA(1,0,2) with zero mean     : 290.294
## 
##  Now re-fitting the best model(s) without approximations...
## 
##  ARIMA(1,0,2) with non-zero mean : 252.8433
## 
##  Best model: ARIMA(1,0,2) with non-zero mean
## Series: df$solid_waste_kg 
## ARIMA(1,0,2) with non-zero mean 
## 
## Coefficients:
##          ar1      ma1      ma2    mean
##       0.3933  -0.3011  -0.2195  0.6723
## s.e.  0.2334   0.2269   0.0728  0.0303
## 
## sigma^2 = 0.2516:  log likelihood = -121.24
## AIC=252.48   AICc=252.84   BIC=268.12
## 
##  Fitting models using approximations to speed things up...
## 
##  ARIMA(2,0,2) with non-zero mean : 481.848
##  ARIMA(0,0,0) with non-zero mean : 489.7931
##  ARIMA(1,0,0) with non-zero mean : 483.6428
##  ARIMA(0,0,1) with non-zero mean : 488.6056
##  ARIMA(0,0,0) with zero mean     : 668.5145
##  ARIMA(1,0,2) with non-zero mean : 481.4292
##  ARIMA(0,0,2) with non-zero mean : 487.558
##  ARIMA(1,0,1) with non-zero mean : 484.5832
##  ARIMA(1,0,3) with non-zero mean : 482.8695
##  ARIMA(0,0,3) with non-zero mean : 487.0004
##  ARIMA(2,0,1) with non-zero mean : 480.5155
##  ARIMA(2,0,0) with non-zero mean : 480.0232
##  ARIMA(3,0,0) with non-zero mean : 478.3711
##  ARIMA(4,0,0) with non-zero mean : 480.7297
##  ARIMA(3,0,1) with non-zero mean : 480.1401
##  ARIMA(4,0,1) with non-zero mean : 479.0072
##  ARIMA(3,0,0) with zero mean     : 539.5893
## 
##  Now re-fitting the best model(s) without approximations...
## 
##  ARIMA(3,0,0) with non-zero mean : 484.9027
## 
##  Best model: ARIMA(3,0,0) with non-zero mean
## Series: df$liquid_waste_kg 
## ARIMA(3,0,0) with non-zero mean 
## 
## Coefficients:
##          ar1      ar2     ar3    mean
##       0.1128  -0.1804  -0.124  1.4030
## s.e.  0.0780   0.0767   0.078  0.0638
## 
## sigma^2 = 0.9932:  log likelihood = -237.27
## AIC=484.53   AICc=484.9   BIC=500.18
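The AIC, AICc, and BIC values reported for each fitted model can be reproduced from the printed log-likelihood using the standard definitions: AIC = -2·logLik + 2k, AICc = AIC + 2k(k+1)/(n-k-1), and BIC = -2·logLik + k·ln(n). The sketch below is in Python, not the report's own R code. It rests on two assumptions that the output does not state: n = 169 observations (the `obs_days` count), and k = 5 estimated parameters per model (three ARMA coefficients, the mean, and the innovation variance).

```python
import math


def information_criteria(loglik, k, n):
    """Return (AIC, AICc, BIC) for a log-likelihood, k parameters, n observations."""
    aic = -2.0 * loglik + 2.0 * k
    aicc = aic + (2.0 * k * (k + 1)) / (n - k - 1)
    bic = -2.0 * loglik + k * math.log(n)
    return aic, aicc, bic


# Assumptions (not stated in the output): n = 169 daily observations and
# k = 5 parameters (three ARMA coefficients, the mean, and sigma^2).
N_OBS, K_PARAMS = 169, 5

# Log-likelihoods copied from the three fitted models above.
logliks = {
    "food_waste_kg": -295.16,
    "solid_waste_kg": -121.24,
    "liquid_waste_kg": -237.27,
}

criteria = {
    name: information_criteria(ll, K_PARAMS, N_OBS) for name, ll in logliks.items()
}

for name, (aic, aicc, bic) in criteria.items():
    print(f"{name:16s} AIC={aic:.2f} AICc={aicc:.2f} BIC={bic:.2f}")
```

Under these assumptions the results match the printed criteria to within rounding of the log-likelihood. The value on each "re-fitting" line (for example 600.6932 for `food_waste_kg`) also matches the printed AICc, which suggests the model search ranked candidates by AICc.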

Boxplots - weekly

Boxplots per customer - weekly

bar plot - weekly

Boxplot - monthly

Boxplot per customer - monthly

Time Series Plots for Independents

(Partial and) Autocorrelation Function

Spectral Analysis

## [1] 3.214286
## [1] 5.294118
## [1] 5.142857
## [1] 5.294118

The spectral peaks suggest a period of roughly 6 days for food waste, whereas food loss shows a cycle of approximately 3 days or 20 days.
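A dominant period like the ones printed above is usually read off a periodogram as the reciprocal of the frequency with the largest spectral power. The sketch below is in Python, not the report's own R code, and assumes that is how the four values were obtained. It computes a raw periodogram with a direct discrete Fourier transform and returns the period at the peak. The function name `dominant_period` and the synthetic series are illustrative only.

```python
import cmath
import math


def dominant_period(series):
    """Return the period (in samples) at the largest raw-periodogram peak."""
    n = len(series)
    mean = sum(series) / n
    # Remove the mean so that frequency 0 does not dominate the spectrum.
    centered = [x - mean for x in series]
    best_k, best_power = 1, -1.0
    for k in range(1, n // 2 + 1):  # positive Fourier frequencies k / n
        coef = sum(
            x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(centered)
        )
        power = abs(coef) ** 2 / n
        if power > best_power:
            best_k, best_power = k, power
    return n / best_k  # period = 1 / frequency


# Synthetic daily series with a 6-day cycle (illustrative data, not the report's).
demo = [math.sin(2 * math.pi * t / 6) for t in range(180)]
period = dominant_period(demo)
print(period)
```

The printed values equal 180/56, 180/34, and 180/35. That is consistent with peak frequencies on a grid of multiples of 1/180, as would arise if the 169-day series was padded to length 180 before the transform. The periods are therefore resolved only to the nearest grid frequency.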